A Novel Nonparallel Plane Proximal SVM for Imbalance Data Classification
Authors
Abstract
Imbalanced data classification is an active research topic in data mining. Conventional classifiers are poorly suited to imbalanced learning tasks because they tend to assign instances to the majority class, which is usually the less important class. This paper focuses on the uneven data distribution that characterizes imbalanced classification problems. Without changing the original imbalanced training data, it shows the advantages of proximal classifiers for imbalanced data classification. To improve classification accuracy, it proposes a new model named LS-NPPC, based on the classical proximal SVM models, which find two nonparallel planes for data classification. The LS-NPPC model is applied to six UCI datasets and one real application. The results indicate the effectiveness of the proposed model for imbalanced data classification problems.
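The nonparallel-plane idea can be illustrated with a minimal sketch. It is not the paper's LS-NPPC model, whose formulation is not given here. It assumes the standard linear least-squares twin SVM formulation, in which each plane is obtained from one linear system, and the function names, the penalty parameters c1 and c2, and the small ridge term are illustrative choices.

```python
import numpy as np

def fit_nonparallel_planes(A, B, c1=1.0, c2=1.0, ridge=1e-8):
    """Fit two nonparallel planes (assumed least-squares twin SVM form).

    A: samples of class +1 (e.g. the minority class), one row per sample.
    B: samples of class -1 (e.g. the majority class), one row per sample.
    Returns u = [w1, b1] and v = [w2, b2].
    """
    E = np.hstack([A, np.ones((A.shape[0], 1))])  # [A e]
    F = np.hstack([B, np.ones((B.shape[0], 1))])  # [B e]
    reg = ridge * np.eye(E.shape[1])              # illustrative stabilizer
    # Plane 1: close to class +1, roughly unit functional distance from class -1.
    u = -np.linalg.solve(F.T @ F + (E.T @ E) / c1 + reg, F.sum(axis=0))
    # Plane 2: close to class -1, roughly unit functional distance from class +1.
    v = np.linalg.solve(E.T @ E + (F.T @ F) / c2 + reg, E.sum(axis=0))
    return u, v

def predict(X, u, v):
    """Assign each row of X to the class whose plane is nearer."""
    Xe = np.hstack([X, np.ones((X.shape[0], 1))])
    d1 = np.abs(Xe @ u) / np.linalg.norm(u[:-1])  # distance to plane 1
    d2 = np.abs(Xe @ v) / np.linalg.norm(v[:-1])  # distance to plane 2
    return np.where(d1 <= d2, 1, -1)

# Synthetic imbalanced data (made up for illustration): 10 vs 200 samples.
rng = np.random.default_rng(0)
A = rng.normal(4.0, 0.5, size=(10, 2))
B = rng.normal(0.0, 0.5, size=(200, 2))
u, v = fit_nonparallel_planes(A, B)
print("minority recall:", (predict(A, u, v) == 1).mean())
print("majority recall:", (predict(B, u, v) == -1).mean())
```

Because each class gets its own plane and prediction depends only on which plane is nearer, the minority class is not simply outvoted by the majority class, which is one plausible reading of why proximal classifiers suit imbalanced data.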
Similar resources
A Sparse Twin SVM for multi-classification problems
We propose Sparse TSVM, a multi-class SVM classifier that determines k nonparallel planes by solving k related SVM-type problems. Sparse TSVM extends Twin SVM to a one-versus-rest approach, and its sparse algorithm captures each class's main features better. On several benchmark data sets, Sparse TSVM is not only fast but also generalizes well.
Detection of Horizontal Gene Transfer in Bacterial Genomes
Most bacterial genes were acquired by horizontal gene transfer (HGT) from other prokaryotic organisms instead of being inherited by continuous vertical descent from an ancient ancestor. HGT is generally believed to be a major factor in microbiology evolution, allowing rapid diversification and adaptation. In this paper, we artificially simulate HGT by inserting phage genes into bacterial genome...
Enhancing the Performance of SVM on Skewed Data Sets by Exciting Support Vectors
In pattern recognition and data mining, a data set is called skewed or imbalanced if it contains a large number of objects of a certain type and a very small number of objects of the opposite type. Imbalance in data sets is a challenging problem for most classification methods, because the generalization power achieved by classic classifiers is not good for skewed data sets. Ma...
Fuzzy Least Squares Twin Support Vector Machines
Least Squares Twin Support Vector Machine (LSTSVM) is an extremely efficient and fast version of the SVM algorithm for binary classification. LSTSVM combines the ideas of Least Squares SVM and Twin SVM, finding two nonparallel hyperplanes by solving two systems of linear equations. Although the algorithm is very fast and efficient in many classification tasks, it is unable to cope with tw...
Parallel selective sampling method for imbalanced and large data classification
Several applications aim to identify rare events from very large data sets. Classification algorithms may present great limitations on large data sets and show a performance degradation due to class imbalance. Many solutions have been presented in literature to deal with the problem of huge amount of data or imbalancing separately. In this paper we assessed the performances of a novel method, P...
Journal title:
- JSW
Volume 9, Issue
Pages -
Publication date 2014